Abstract Life expectancy keeps growing and, among elderly people,
accidental falls occur frequently. A system able to promptly
detect falls would help reduce the injuries that a fall can
cause. Such a system should meet the needs of the people for
whom it is designed, so that it is actually used. In particular,
the system should be minimally invasive and inexpensive. Because
most smartphones embed accelerometers and powerful processing
units, they are good candidates both as data acquisition devices
and as platforms to host fall detection systems. For this reason,
in recent years several fall detection methods have been tested
on smartphone accelerometer data. Most of them have been tuned
with simulated falls because, to date, datasets of real-world
falls are not available. This article evaluates the effectiveness
of methods that detect falls as anomalies. To this end, we
compared traditional approaches with anomaly detectors. In
particular, we experimented with the kNN and the SVM methods
using both the one-class and two-class configurations. The
comparison involved three different collections of accelerometer
data and four different data representations. Empirical results
showed that, in most cases, falls are not required to design an
effective fall detector.

1 Introduction

Falls are a major health risk that impacts the quality of life
of elderly people. When a fall occurs, a prompt notification
would help reduce the injuries that the fall can cause. An
effective fall detection system should address the following
requirements (Abbate et al, 2012): 1) automatic notification of
falls that have occurred; 2) promptness, in order to provide
quick help; 3) reliability of the fall detection techniques;
4) communication capabilities, in order to alert the caregivers;
5) usability, in order to facilitate users’ acceptance.

Several solutions have been proposed: some address the problem
as a whole, while others focus on one specific requirement. The
contribution of this article concerns the reliability of the
fall detection techniques.

Several factors characterize a fall detection technique: from
the sensors used to acquire data, to the features extracted;
from the algorithms used to detect falls, to the types of
datasets used to train the algorithm. The approaches that have
been proposed differ in the choices made with respect to those
factors.

As regards data acquisition, ambient sensors, wearable sensors,
or a combination of the two are the principal data sources used
in these techniques (Mubashir et al, 2013; Liming Chen et al,
2012). Many recent approaches investigate the possibility of
using the sensors provided by smartphones (Medrano et al, 2014;
Sposaro and Tyson, 2009; Abbate et al, 2012), which are
widespread and require almost no installation or set-up.
Moreover, they do not introduce any additional cost, can be used
anywhere, and are accepted by end users because they are already
part of their everyday life.

The techniques also differ in the type of data used to detect
falls. Most popular fall detection techniques exploit
accelerometer data as the main input to discriminate between
falls and activities of daily living (ADL). Fig. 1a shows an
example of accelerometer data representing a fall, extracted
from the dataset provided by Medrano et al (2014). In
particular, the fall was recorded with a Samsung Galaxy Mini
smartphone. Fig. 1b illustrates the accelerometer data recorded
by two sensors placed, respectively, on a Galaxy S II (from the
dataset by Anguita et al (2013)) and a Galaxy Nexus (recorded by
ourselves). These data capture a walk performed by two different
subjects. The two recordings clearly share a general trend. This
suggests that it is possible to define a fall detection method
that is general and independent of the specific device.

To verify the effectiveness of the method used by a technique to
detect falls, data acquired by the sensors are arranged into
labeled datasets containing both ADL and falls, the latter
usually simulated by volunteers. Often, datasets are processed
in order to obtain features: from simple raw data to more
complex indicators (e.g., magnitude and Fourier transform) whose
processing requires time and computational resources. Methods
can be divided into two main categories: those based on domain
knowledge and those based on machine learning techniques
(Mirchevska et al, 2014). The approaches currently proposed,
regardless of their category, have in common the fact that they
require a set of falls in their training phase. Unfortunately,
human simulations are significantly different from real-world
falls (Klenk et al, 2011), and this could make those fall
detection techniques unsuitable for real-world applications.

For this reason, Medrano et al (2014) experimented with a
machine learning technique based on a one-class classifier that
is trained only on ADL and detects falls as anomalies with
respect to ADL. In particular, their experiments were conducted
with a k-Nearest Neighbour (kNN) classifier. As data
representation they used the magnitude, which does not require a
huge amount of resources to be calculated. Medrano et al (2014)
experimented on a publicly available dataset containing both ADL
and falls simulated by several human subjects and recorded by
the same device. Moreover, Medrano et al (2014) also
experimented with a two-class Support Vector Machine (SVM) on
the same dataset. The SVM produced slightly better results than
the one-class kNN. Thus, they concluded that anomaly detectors
are infeasible for detecting falls. Finally, in the article they
explicitly state that data are acquired by accelerometers
mounted on smartphones. This suggests that the idea of running
the analysed methods on smartphones was taken into
consideration. From our point of view, a smartphone can hardly
support the execution of an SVM while ensuring good performance.
Indeed, as Mazhelis (2006) states, SVMs feature very high
computational requirements for training. Since the final aim is
also to provide a continuous learning system, the high
complexity of the training phase is critical when the deployment
occurs on a mobile device with limited computational power and,
most of all, with limited power resources. Indeed, energy
consumption is today one of the main issues in mobile computing,
especially when dealing with physical sensors (Pejovic and
Musolesi, 2015).

The aim of our work is to evaluate the effectiveness of methods
that detect falls as anomalies, compared with traditional
approaches that use two-class classifiers to distinguish between
falls and ADL. We compared anomaly detectors based on one-class
kNN and SVM with traditional detectors based on two-class kNN
and SVM. We considered four different data representations
calculated from accelerometer data acquired by smartphones: raw
data, magnitude, accelerometer features, and local temporal
patterns. We evaluated the classifiers with respect to
variations of the acquisition conditions: different sensors,
different human subjects, and different sensor positions. All
the experiments have been conducted on two publicly available
datasets (Medrano et al, 2014; Anguita et al, 2013) of
accelerometer data acquired by smartphones. Evaluation metrics,
such as area under the curve (AUC), sensitivity, and
specificity, confirmed, in most cases, that anomaly detection
techniques are quite robust against variations of the
acquisition conditions.

The rest of the paper is organised as follows: Section 2
introduces the motivations of our work and discusses related
work; Section 3 outlines the experiment design; Section 4
presents the results of the experiments; Section 5 discusses the
achieved results; finally, Section 6 provides some details about
future directions.

2 Motivation and Related Work 

In the near future the number of elderly people is expected to
grow. Indeed, the World Population Ageing Report states that the
global share of elderly people (aged 60 years or over) will
reach more than 21% by 2050 (more than 2 billion people) (United
Nations, 2013). Ageing results from the demographic transition,
a process in which reductions in mortality are followed by
reductions in fertility (United Nations, 2013; Carone and
Costello, 2006). The increasing trend of life expectancy has
been directly proportional to the increase in disability
(Karmarkar, 2009). Thus, the oldest people represent the
greatest challenge in providing health-related services and
identifying ways to assist them in maintaining independence
(Mann, 2004). Indeed, 31.2% of people aged 80 to 84, and 49.5%
of those over age 85, require assistance with everyday
activities (Federal Interagency Forum on Aging-Related
Statistics - National Center for Health Statistics, 2012). This
increase results in a growing need for support (human or
technological) that enables the older population to perform
daily activities (US Census Bureau, 2013).

Fig. 1 Examples of accelerometer data: (a) a fall as acquired by
a smartphone; (b) a walking activity recorded by two different
smartphones and performed by two different subjects.

Intensive research efforts have been and are still focused on
the identification of solutions that, on the one hand,
automatically assist elderly people in performing daily
activities and, on the other hand, promptly detect anomalous
situations related to diseases or purely to old age, such as the
worsening of mild cognitive impairment (Acampora et al, 2013),
the prompt identification of conditions favorable to heart
failure (Deshmukh and Shilaskar, 2015), and the prompt detection
of falls (Mubashir et al, 2013).

Falls are a major health risk that impacts the quality of life
of elderly people. Among elderly people, accidental falls occur
frequently: 30% of the over-65 population falls at least once
per year, and the proportion increases rapidly with age (Tromp
et al, 2001). Moreover, fallers who are not able to get up are
more likely to require hospitalization or, even worse, to die
(Tinetti et al, 1993). Thus, several approaches have been
proposed to promptly detect falls. They mainly differ with
respect to (i) the sensors used to acquire data, (ii) the data
representation (features) used by the method, and (iii) the
method used to detect falls.

Table 1 summarizes the analysis performed on a set of
representative approaches. The table has been specifically
designed to highlight the characteristic features of each
approach in terms of (i) sensors, (ii) data representation and
(iii) methods. In particular, the first two columns give the
approach and the method it uses, respectively. The third column
states whether the approach requires a set of falls to train the
algorithm. The fourth column lists the set of features used to
infer a fall. Finally, the fifth and sixth columns specify,
respectively, the type of wearable device used to sense data
(ad-hoc solutions or smartphone sensors) and the sensors
involved.

Table 1 aims to give an idea of how many different approaches
have been proposed. Most of the approaches rely on data coming
from ad-hoc wearable sensing devices, and only a few on
smartphone sensors. The most commonly used sensors are
accelerometers. The approaches use features that are very
different from each other, some of them very complex in terms of
computation. Half of the approaches are based on threshold-based
techniques and the other half on machine learning techniques.
Finally, most of the approaches are based on methods that
require a set of falls to train the underlying algorithm.

Other approaches not outlined in Table 1 can be found in the
many surveys dedicated to fall detection (e.g., Mubashir et al
(2013); Mohamed et al (2014); Hijaz et al (2010)). Among the
others, Bagalà et al (2012) is particularly interesting because
it compares the most popular techniques for the identification
of falls based on accelerometer data.

2.1 Sensors and Data Representation 

Fall detection methods rely on data acquired by sensing devices.
Images, accelerometer data, audio, and angular velocities are
only a few examples. Data are captured by environmental or
wearable sensors or by a combination of both (Mubashir et al,
2013; Liming Chen et al, 2012). Ambient sensors introduce many
issues such as privacy, installation costs, and invasiveness.
Moreover, a person can fall anywhere. Thus, wearable sensors are
better suited to this application domain. Wearable sensors
include both ad-hoc solutions and smartphone sensors. Ad-hoc
solutions generally include a microcontroller and a set of
attached sensors. Such devices are then placed on specific areas
of the body (e.g., wrist, arm, ankle). Thus, they require
explicit acceptance by elderly people. Conversely, smartphones
are generally present in everyday life. Therefore, the use of
smartphones does not require changes in daily habits and does
not involve additional costs.

Regardless of the type of wearable device, most of the
approaches use accelerometers, and a few use accelerometers
together with gyroscopes. For this reason, our experiments have
considered accelerometers only.

As regards features, Table 1 shows that the various approaches
use features of different nature. Therefore, no common trend
exists. The only feature that recurs frequently is magnitude.

Some of the features used are generic, such as the magnitude,
the energy, and the standard deviation. Others are specifically
related to the application domain, such as the time of free
fall, the time of reverse impact, and the time of inactivity.

Some of them perform well (such as magnitude and Fourier
transform), but require high processing times and/or
considerable computing resources with respect to the application
domain and, in the case of smartphones, to the device on which
the features will be calculated. Indeed, timeliness and
lightness of computation are crucial for those features to be
used effectively on smartphones.


2.2 Methods 

Methods can be divided into those based on domain knowledge and
those based on machine learning techniques (Mirchevska et al,
2014): the former usually apply heuristics, while the latter
usually rely on the definition of classifiers able to detect
falls. From our perspective, regardless of the type, what
differentiates the techniques is the need for data representing
falls in the data set used to train the algorithm. Most of the
proposed approaches require falls in the training data set in
order to properly configure their method. Falls are mostly
obtained by relying on volunteers who are asked to perform daily
activities (such as sitting, walking, and so on) and to simulate
falls.

Even if the results achieved by those approaches are very
promising, it is quite difficult to generalize them because the
experiments are almost always limited to a single ad-hoc data
set. In addition, as stated by Klenk et al (2011), simulated
falls differ significantly from real-world falls. Thus, having
simulated falls in the training dataset could produce
classifiers that behave differently with real-world falls.

Given the above considerations, a method based on the detection
of anomalies with respect to ADL may be a better solution for
this kind of application domain. Fall detection is not the only
case in which the detection of anomalies is the better choice in
designing a classifier.
Table 1 Related work

| Approach | Method | Falls needed? | Features | Smartphone / Ad-hoc | Sensors |
|---|---|---|---|---|---|
| Medrano et al (2014) | K-means+NN | no | Magnitude | Smartphone | Triaxial accelerometer |
| Tolkiehn et al (2011) | Threshold based | yes | Magnitude of standard deviation per axis; Std of the magnitude; Ratio of the polar angle; Delta of two consecutive polar angles; Barometric pressure | Ad-hoc | Triaxial accelerometer; Barometric pressure |
| Wang et al (2014) | Threshold based | yes | Signal magnitude vector; Heart rate value; Trunk angle | Ad-hoc | Triaxial accelerometer; Heart rate monitor |
| Bourke et al (2007) | Threshold based | yes | Magnitude | Ad-hoc | Dual-axis accelerometers placed orthogonally |
| Li et al (2009) | Threshold based | yes | Magnitude of acceleration; Magnitude of angular velocity | Ad-hoc | Triaxial accelerometer; Triaxial gyroscope |
| Zhang et al (2006) | One-class SVM | yes | Time of free fall; Variance of acceleration during free fall; Time of reverse impact; Mean and variance of acceleration during reverse impact | Ad-hoc | Triaxial accelerometer |
| Chen et al (2006) | Threshold based | yes | Magnitude | Ad-hoc | Dual-axis accelerometers placed orthogonally |
| Nyan et al (2008) | Threshold based | yes | Correlation coefficient between thigh and waist deviation from vertical axis; Correlation coefficient between angular velocity and reference template | Ad-hoc | Triaxial accelerometer; Two-axis gyroscope |
| Abbate et al (2012) | Threshold based | yes | Magnitude; Time of inactivity; Peak time; Impact start; Impact end | Smartphone, Ad-hoc | Triaxial accelerometer |
| Ge and Shuwan (2008) | Threshold based | yes | Inertial frame vertical acceleration; Inertial frame vertical velocity; Time of free fall | Ad-hoc | Triaxial accelerometer; Two-axis gyroscope |
| Mellone et al (2012) | Threshold based | yes | Acceleration sum vector; Vertical axis orientation | Smartphone | Triaxial accelerometer |
| Rabah et al (2012) | Threshold based | yes | Magnitude; Orientation | Ad-hoc | Triaxial accelerometer |
| Shibuya et al (2015) | SVM | yes | Range of angular velocity; Range of acceleration | Ad-hoc | Triaxial accelerometer; Two-axis gyroscope; Single-axis gyroscope |
| Zhuang et al (2015) | SVM | yes | Magnitude; Ascending coefficient; Descending coefficient | Ad-hoc | Triaxial accelerometer |
| Sposaro and Tyson (2009) | Threshold based | yes | Magnitude; Angle of body | Smartphone | Triaxial accelerometer |
| Dai et al (2010) | Threshold based | yes | Magnitude; Ad-hoc feature; Shape Context; Hausdorff distance | Smartphone, Ad-hoc | Triaxial accelerometer; Magnetometer |
| Lee and Carlisle (2011) | Threshold based | yes | Magnitude; Mean of magnitude | Smartphone | Triaxial accelerometer |
| Albert et al (2012) | SVM | yes | Moments (mean, abs(mean), std, skew, kurtosis); Moments of the difference between successive samples; Smoothed root mean squares; Extremes (min, max, abs(min), abs(max)); Histogram; Fourier components; Mean magnitude, mean of cross products (xy, xz, yz), abs(mean of cross products) | Smartphone | Triaxial accelerometer |
| Fang et al (2012) | Threshold based | yes | Magnitude; Vertical acceleration value | Smartphone | Triaxial accelerometer; Gyroscope |
There are many other situations in which real-world data are
very difficult to obtain: imagine a system able to infer
terrorist attacks. Real-world training sets are very rare or
even nonexistent. In the related work we analysed, only Medrano
et al (2014) have assessed the robustness of one-class
classifiers trained with a set of ADL. Indeed, Medrano et al
(2014) agree with us, stating that “traditional approaches to
this problem suffer from a high false positive rate,
particularly, when the collected sensor data are biased toward
normal data while the abnormal events are rare”.

Medrano et al (2014) also experimented with a two-class SVM
(Support Vector Machine). They concluded that the SVM obtains
better results than the one-class kNN classifier in detecting
anomalies. Despite the good results, training an SVM would be
more expensive in terms of computational resources than training
a kNN-based one-class classifier.


3 Experiment Design 

In this article we focus on methods that detect falls by
exploiting smartphone accelerometer data. In particular, we
evaluated the robustness of anomaly detectors compared to
traditional detectors that, in turn, are tuned with fall
instances. To this end, we designed several experimental setups
by varying both materials and methods:

- Data: we have created three different collections of
  smartphone accelerometer data by mixing the data of two
  publicly available smartphone accelerometer datasets (Medrano
  et al (2014) and Anguita et al (2013)) that have been recorded
  by different devices with different setups. We have created two
  sets of these collections by selecting different sizes of the
  time window of the accelerometer data. Experimenting on these
  collections makes it possible to assess the robustness with
  respect to changes in acquisition conditions.

- Feature vectors: we experimented with four different feature
  vectors ranging from the simplest to the most complex: raw
  data, magnitude, accelerometer features, and local temporal
  patterns. Assessing the quality of feature vectors is
  especially meaningful in a mobile, real-time environment where
  the computational capacity may be limited.

- Classification schema: we compared two different
  classification schemas based on the k-Nearest Neighbour (kNN)
  classifier (Fix and Hodges, 1951): one-class (corresponding to
  the anomaly detector) versus two-class. Moreover, we also
  compared the results achieved with the one-class and the
  two-class classification schemas based on the Support Vector
  Machine (SVM) classifier (Cortes and Vapnik, 1995).


3.1 Publicly Available Datasets 

One of the considered datasets contains both Activities of Daily
Living (ADL) and falls performed by ten participants, 7 males
and 3 females, ranging from 20 to 42 years old (Medrano et al,
2014). The ADL set has been recorded in real-life conditions:
participants carried a smartphone in their pocket for at least
one week to record everyday behaviour. On average, about 800 ADL
instances were collected from each subject during this period.
Participants simulated eight different types of falls: forward
falls, backward falls, left and right-lateral falls, syncope,
sitting on an empty chair, falls using compensation strategies
to prevent the impact, and falls with contact with an obstacle
before hitting the ground. Participants carried a smartphone in
each of their two pockets (left and right) and performed the
falls on a soft mattress in a laboratory environment. They
repeated each fall three times for a total of 24 fall
simulations. The dataset contains 503 falls (footnote 1) and
7816 ADL.

Accelerations have been recorded through the built-in triaxial
accelerometer of a Samsung Galaxy Mini phone running the Android
operating system version 2.2. The sampling rate was not stable,
with a value of about 45 ± 12 Hz. During the daily life
monitoring, whenever a peak in the acceleration magnitude was
detected to be higher than 1.5 g (g = gravity acceleration), a
new data instance was recorded. Each data instance included the
information in a time window of 6 s around the peak. During each
fall simulation, a 6 s time window around the highest peak has
been recorded. Afterwards, the offset error of each axis was
removed and an interpolation was performed to get a sample every
20 ms (50 Hz). We will refer to this set of data as dataset1.
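
The interpolation step can be illustrated with a minimal sketch. It assumes plain linear interpolation of one axis onto a uniform 50 Hz grid (one sample every 20 ms); the interpolation scheme actually used for the dataset is not stated here, and the function name `resample_uniform` is hypothetical. Offset removal is not shown.

```python
from bisect import bisect_right


def resample_uniform(t, values, rate_hz=50.0):
    """Linearly interpolate (t, values) onto a uniform grid starting at t[0].

    t      : strictly increasing timestamps in seconds (at least two)
    values : one acceleration axis sampled at those timestamps
    Linear interpolation is an assumption made for this sketch.
    """
    step = 1.0 / rate_hz
    # Number of grid points that fit in the recording (small guard for rounding).
    n_out = int((t[-1] - t[0]) / step + 1e-9) + 1
    grid = [t[0] + i * step for i in range(n_out)]
    out = []
    for g in grid:
        # Index of the last original sample at or before g, clamped to a valid segment.
        j = min(max(bisect_right(t, g) - 1, 0), len(t) - 2)
        w = (g - t[j]) / (t[j + 1] - t[j])
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, out
```

Applying the function to each of the three axes separately yields a uniformly sampled triaxial signal.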

The other dataset considered contains only ADL performed by a
group of 30 volunteers with ages ranging from 19 to 48 years
(Anguita et al, 2013). Each person was instructed to follow a
protocol of 6 activities: standing, sitting, lying down,
walking, walking downstairs, and walking upstairs. Each subject
performed the protocol twice while wearing a Samsung Galaxy S II
smartphone: in the first trial the smartphone was fixed on the
left side of the belt, and in the second it was placed by the
user as preferred. The tasks were performed in laboratory
conditions, but volunteers were asked to perform the sequence of
activities freely for a more naturalistic dataset. The
accelerometer signals were pre-processed by applying noise
filters and then sampled in fixed-width sliding windows of
2.56 s with 50% overlap, thus obtaining 128 readings per window.
The total number of accelerometer instances is 10299. We will
refer to this set of data as dataset2.

1 The authors declared that due to technical issues some falls
had to be repeated in a few cases, so the number is higher than
24 x 2 x 10.

For the analysis presented in this paper, we considered two
different sub-windows of the accelerometer patterns taken around
the peak. In more detail, we considered two sub-windows of:

- 2.56 s, corresponding to a vector of 128 samples;

- 1 s, corresponding to a vector of 51 values.
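
As an illustration of how such a sub-window could be cut from a longer 50 Hz recording, the sketch below centres the window on the sample with the highest magnitude and shifts it back inside the recording when the peak is close to an edge. Both the centring rule and the edge handling are assumptions of this sketch, and `window_around_peak` is a hypothetical name.

```python
import math


def window_around_peak(samples, n_samples):
    """Return n_samples consecutive (x, y, z) samples around the magnitude peak.

    samples   : list of (x, y, z) tuples sampled at 50 Hz
    n_samples : 128 for the 2.56 s window, 51 for the 1 s window
    """
    if n_samples > len(samples):
        raise ValueError("window longer than the recording")
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    peak = mags.index(max(mags))
    start = peak - n_samples // 2
    # Keep the window inside the recording when the peak is near an edge.
    start = max(0, min(start, len(samples) - n_samples))
    return samples[start:start + n_samples]
```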

3.2 Data Collections Description 

As discussed before, we have created three different collections
of smartphone accelerometer data by mixing the data of two
publicly available smartphone accelerometer datasets (Medrano et
al (2014) and Anguita et al (2013)). For the evaluation we used
a 10-fold cross-validation approach. Each of the three data
collections is therefore composed of 10 folds, each using 90% of
the data for training and 10% for testing. In more detail:

- Collection 1. ADL: 7035 training and 781 test data. FALL: 453
  training and 50 test data. Both ADL and FALL data have been
  taken from dataset1;

- Collection 2. ADL: 7035 training and 781 test data. Half of
  the ADL data have been randomly taken from dataset1 and half
  from dataset2. FALL: 453 training and 50 test data. All the
  FALL data have been extracted from dataset1;

- Collection 3. ADL: 9270 training and 1029 test data. All the
  ADL data have been taken from dataset2. FALL: 453 training and
  50 test data. All the FALL data have been extracted from
  dataset1.

3.3 Feature vectors 

As discussed before, we considered four different feature
vectors extracted from two windows of different size: 2.56 s,
corresponding to a vector of 128 samples, and 1 s, corresponding
to a vector of 51 values. In more detail, we considered:

Raw data. This is the simplest representation. Each data
instance is the concatenation of the three accelerometer series
(x, y, z), one for each direction. We obtain a final feature
vector of size 384 in the case of 128 samples and of size 153 in
the case of 51 samples.


Magnitude. This vector of features has been obtained from the
three accelerometer series (x, y, z) as follows:

$M = \sqrt{x^2 + y^2 + z^2}$.

We obtained a vector of size 128 and another of size 51.
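
The two simplest representations can be sketched in a few lines. The ordering of the raw vector (all x values, then all y, then all z) is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def raw_vector(window):
    """Concatenate the x, y and z series of a window of (x, y, z) samples."""
    xs, ys, zs = zip(*window)
    # Assumed ordering: all x values, then all y values, then all z values.
    return list(xs) + list(ys) + list(zs)


def magnitude_vector(window):
    """Per-sample magnitude sqrt(x^2 + y^2 + z^2)."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in window]
```

A window of 128 samples therefore gives a raw vector of 384 values and a magnitude vector of 128 values; a window of 51 samples gives 153 and 51 values.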
Accelerometer features. These feature vectors have been obtained
by concatenating four different features for each direction:
mean of the acceleration values, standard deviation of the
acceleration values, energy of the acceleration, and correlation
of the acceleration. The final feature vector is of size 12. The
energy of the acceleration is calculated as follows:

$Energy = \frac{\sum_{i=1}^{N} a_{fft_i}^2}{N}$,

where N is the number of samples of the acceleration patterns
and $a_{fft_i}$ are the fast Fourier transform components of the
input patterns. The correlation of the acceleration is
calculated between pairs of directions: x versus y, x versus z,
and y versus z.
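
A minimal sketch of the 12-dimensional vector follows. It makes several assumptions that are not confirmed by the text: the energy is taken as the sum of the squared magnitudes of the Fourier components divided by N, the standard deviation is the population one, and the correlation is Pearson's coefficient. A direct discrete Fourier transform is used instead of an FFT to keep the example dependency-free; it yields the same components. All function names are hypothetical.

```python
import cmath
import math


def _mean(v):
    return sum(v) / len(v)


def _std(v):
    m = _mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / len(v))


def _energy(v):
    """Assumed definition: sum of squared DFT magnitudes divided by N."""
    n = len(v)
    total = 0.0
    for k in range(n):
        coeff = sum(a * cmath.exp(-2j * math.pi * k * i / n)
                    for i, a in enumerate(v))
        total += abs(coeff) ** 2
    return total / n


def _corr(u, v):
    """Pearson correlation; undefined (division by zero) if an axis is constant."""
    mu, mv = _mean(u), _mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return cov / den


def accelerometer_features(window):
    """Mean, std and energy per axis, plus the three pairwise correlations."""
    x, y, z = (list(axis) for axis in zip(*window))
    axes = (x, y, z)
    return ([_mean(a) for a in axes]
            + [_std(a) for a in axes]
            + [_energy(a) for a in axes]
            + [_corr(x, y), _corr(x, z), _corr(y, z)])
```

With this definition of energy, Parseval's identity makes the value equal to the sum of the squared samples of the axis, which gives a cheap way to check the computation.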

Local Temporal Patterns. This feature is the most complex
representation. The feature vector is composed of the
concatenation of acceleration patterns obtained by comparing the
magnitude of each sample ($M_s$) with several boosted magnitude
values of N neighbour samples ($M_i^n$). The boosted magnitude
values corresponding to a given neighbour i are obtained by
increasing the original magnitude $M_i$ by an increasing decimal
value n:

$M_i^n = n + M_i$,

with n ranging from 0 to $M_{max}$. The value $M_{max}$
corresponds to the nearest decimal value of the maximum
magnitude. For each sample we have $M_{max} + 1$ comparisons as
results of the following inequality:

$M_s > M_i^n$.

The result of each comparison is represented as a binary vector
map of size N, with 1 indicating that the above inequality is
satisfied and 0 that it is not. All the comparison maps are then
summed in order to obtain a single vector map of size N. The
number of neighbours has been set to N = 6. The final feature
vector is obtained by concatenating the maps obtained for each
sample, thus giving vectors of size 6 x 51 = 306 and
6 x 128 = 768.

3.4 Methods and Their Evaluation 

We have considered the one-class k-Nearest Neighbour (kNN)
classifier and the one-class SVM classifier as anomaly detection
techniques. These classifiers have been trained only with ADL
instances and tested with both ADL and FALL instances. Given a
test instance, if the anomaly score is higher than a given
threshold, the new instance is classified as an anomaly/fall;
otherwise it is classified as an ADL. By varying the threshold,
the receiver operating characteristic (ROC) curve and the area
under the ROC curve (AUC) can be obtained. We also calculated a
specific value of sensitivity
($SE = \frac{TruePositives}{TruePositives + FalseNegatives}$)
and specificity
($SP = \frac{TrueNegatives}{TrueNegatives + FalsePositives}$).
These values have been obtained by selecting the point that
maximised their geometric mean $\sqrt{SE \cdot SP}$ in a ROC
curve averaged over the cross-validation results.

Daniela Micucci et al.
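
The scoring and thresholding procedure can be sketched as follows for a single fold. The anomaly score used here, the Euclidean distance to the k-th nearest ADL training vector, is an assumption of this sketch, as are the function names; the averaging of ROC curves over folds is not shown.

```python
import math


def knn_anomaly_score(train_adl, x, k=1):
    """Euclidean distance from x to its k-th nearest ADL training vector."""
    dists = sorted(math.dist(x, a) for a in train_adl)
    return dists[k - 1]


def roc_points(scores, is_fall):
    """(threshold, SE, SP) triples for the rule 'fall if score > threshold'."""
    falls = [s for s, f in zip(scores, is_fall) if f]
    adls = [s for s, f in zip(scores, is_fall) if not f]
    points = []
    for t in [-math.inf] + sorted(set(scores)):
        se = sum(s > t for s in falls) / len(falls)   # TP / (TP + FN)
        sp = sum(s <= t for s in adls) / len(adls)    # TN / (TN + FP)
        points.append((t, se, sp))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve (SE against 1 - SP)."""
    curve = sorted((1.0 - sp, se) for _, se, sp in points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


def best_operating_point(points):
    """Point that maximises the geometric mean sqrt(SE * SP)."""
    return max(points, key=lambda p: math.sqrt(p[1] * p[2]))
```

In use, `knn_anomaly_score` is evaluated for every test vector against the ADL-only training set, and the resulting scores, together with the true labels, are passed to `roc_points`.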

We compared the anomaly detectors with a two-class kNN and a
two-class SVM with a radial basis function kernel. These
classifiers have been trained and tested with instances of both
ADL and FALL. We converted the distances obtained by the kNN
into scores ranging from 0 to 1. By thresholding the scores we
drew the ROC curve.

All the algorithms have been implemented in Matlab and tested on
a PC running the Ubuntu 14.10 distribution. Regarding the two
variants of kNN, for each fold of the 10-fold cross-validation
we have performed an inner 10-fold cross-validation to choose
the best number of neighbours k. We experimented with 10 values
of k ranging from 1 to 10, and we used the Euclidean distance as
distance measure. Regarding the SVM classifier, we used the
built-in Matlab package. For each fold of the 10-fold
cross-validation we have performed an inner 10-fold
cross-validation to choose the best regularisation and kernel
parameters. The Matlab implementation of the SVM outputs scores
along with decisions. By thresholding these scores we obtain the
corresponding ROC curve.

In order to make the results reproducible, the data collections
as well as the Matlab code used for the experiments are
available on the authors’ website (footnote 2).

4 Results 

Tables 2, 4, 6, and 8 report the results achieved on the three
collections by all the classification schemas and feature
vectors in the case of accelerometer data instances made of 128
samples. It is quite evident that the two-class SVM performs
better than the other classification schemas, especially on the
first two collections. It is also quite clear that raw data and
energy feature vectors perform better than the others, with raw
data being the best. As can be noticed, the one-class and
two-class kNN achieve close performance in most cases.

2 http://www.sal.disco.unimib.it/research/ambient-assisted-living/


The raw data (the simplest feature vector) is one of the best feature vectors independently of collection and classification schema (see Table 4 and Table 6). This result is quite new to the scientific community, since the most used features are usually more complex, such as magnitude and energy features.
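
To make the contrast concrete, the sketch below computes a raw-data vector and a magnitude vector from one window of triaxial samples. The concatenation order of the axes in the raw vector and the function names are assumptions made for illustration; energy and LTP features are not shown.

```python
import math


def raw_feature(window):
    """Concatenate the x, y and z series of a window into one vector."""
    xs, ys, zs = zip(*window)
    return list(xs) + list(ys) + list(zs)


def magnitude_feature(window):
    """Per-sample Euclidean norm of the acceleration vector."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in window]


# Three (x, y, z) samples; a real window would hold 51 or 128 of them.
window = [(0.0, 0.0, 1.0), (3.0, 4.0, 0.0), (1.0, 2.0, 2.0)]
print(raw_feature(window))        # 9 values: 3 axes x 3 samples
print(magnitude_feature(window))  # [1.0, 5.0, 3.0]
```

The raw vector needs no arithmetic at all, while the magnitude vector discards the direction of the acceleration and keeps one value per sample.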

The results achieved on the third collection by all the feature vectors and classification schemas suggest that this collection is not challenging. In fact, this collection has been built by using the ADL from dataset2 and the FALL from dataset1. The results obtained on this collection depend on the fact that the experimentation protocols of the underlying datasets are quite different. In the case of dataset1, the ADL were obtained by recording at least one week of daily life, while in the case of dataset2 each person was instructed to perform 6 ADL in a laboratory environment. The difference between the ADL data in the two datasets makes the classification problem too easy.

Tables 3, 5, 7 and 9 report the results achieved on the three collections by all the classification schemas and feature vectors when the accelerometer data instances are made of 51 samples. Here, the one-class and two-classes classifiers achieve close performance in every case. It should be noticed that the performance achieved with 51 samples by all the kNN-based solutions is better than with 128 samples. Moreover, even in this case, raw data proved to work better than, or comparably to, more complex feature vectors.

The results achieved by the SVM classifier in the case of 128 samples make clear that the two-classes classifier performs better than a novelty detector. This is not true in the case of 51 samples, where we showed that, using raw data, the gap between the SVM and the novelty detector is very small. These results improve on those achieved by Medrano et al (2014), who demonstrated that a two-classes SVM is much better than a novelty detector when the accelerometer data is represented as magnitude and is composed of 51 samples. This is more visible in Table 10, which includes the best results achieved by a two-classes and a one-class classifier. The novelty detector based on the kNN classifier achieves a performance that is only about 1% lower than that of the two-classes SVM classifier (on collection 2, for instance, $\sqrt{SE \cdot SP}$ is 0.969 versus 0.981). The ROC curves that compare the best one-class classifier with the best two-classes classifier on collections 1 and 2 are shown in Fig. 2. These curves also show that the performance of the novelty detector is very close to that of the two-classes classifier.

Fig. 2 ROC curves corresponding to the comparison between the best novelty detector and the best two-classes detector. (a) Collection 1. (b) Collection 2.
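
The size of the gap can be checked directly from the reported sensitivity and specificity values. The short sketch below uses the collection 2 figures for the best two-classes classifier (2-svm on raw data, 128 samples, Table 2) and the best one-class classifier (1-knn on raw data, 51 samples, Table 3); the variable names are illustrative.

```python
import math


def geometric_mean(se, sp):
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(se * sp)


# (SE, SP) pairs copied from the collection 2 rows of Tables 2 and 3.
two_class = geometric_mean(0.976, 0.986)  # 2-svm, RAW, 128 samples
one_class = geometric_mean(0.960, 0.978)  # 1-knn, RAW, 51 samples

# Relative gap, in percent, of the one-class result with respect to the two-classes one.
gap = 100.0 * (two_class - one_class) / two_class
print(round(two_class, 3), round(one_class, 3), round(gap, 1))
```

On these values the relative gap is roughly 1.2%.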


5 Discussion and Conclusion 

In this work we evaluated the robustness of anomaly detectors (one-class classifiers) compared to that of traditional two-classes detection methods that, in turn, are tuned with fall instances. To this end, we experimented with several methods on three different collections of accelerometer data and four different feature vectors. The experiments have demonstrated that:

— a very simple feature vector based on raw data is 
very robust for detecting falls in both the one-class and two-classes schemas;

— a greater number of samples of acceleration instances 
penalises kNN classification schemas. In contrast, 
the SVM classifier does not seem to suffer from 
changes in the number of samples. This makes the one-class kNN classifier more feasible in the case of 51 samples;

— in the case of 128 samples a novelty detector is reli¬ 
able only if it is based on raw data. In the case of 51 
samples a novelty detector is reliable if it is based on either raw data or magnitude.

Overall, considering that in the case of raw data the gap between the SVM and the one-class kNN is very small, we can conclude that a fall detection system based on a novelty detector is feasible in a real scenario. This is especially true considering the limited computation capacity and power resources of the smartphone. In fact, the raw data does not require further processing and the kNN schema is based on a simple Euclidean distance.


6 Future Directions 

In order to further validate the robustness of our approach, we need to experiment with additional datasets. These datasets should contain ADL performed by different people and recorded by different smartphones.

As the number of freely available datasets is extremely limited, we decided to develop an application that is able to acquire data from smartphones’ sensors and to automatically label them (falls or ADL). This enables us to enrich the datasets of ADL and of simulated falls.
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to improve the paper. 
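| C. | Feat. | Class. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|---|---|
| 1 | RAW | 1-knn | 0.955 | 0.950 | 0.899 | 0.924 |
| 1 | RAW | 2-knn | 0.969 | 0.932 | 0.927 | 0.929 |
| 1 | RAW | 1-svm | 0.939 | 0.936 | 0.867 | 0.901 |
| 1 | RAW | 2-svm | 0.989 | 0.964 | 0.981 | 0.972 |
| 1 | Magn. | 1-knn | 0.822 | 0.900 | 0.656 | 0.768 |
| 1 | Magn. | 2-knn | 0.876 | 0.834 | 0.761 | 0.794 |
| 1 | Magn. | 1-svm | 0.765 | 0.880 | 0.615 | 0.735 |
| 1 | Magn. | 2-svm | 0.977 | 0.924 | 0.954 | 0.939 |
| 1 | Energ. | 1-knn | 0.811 | 0.870 | 0.695 | 0.777 |
| 1 | Energ. | 2-knn | 0.925 | 0.910 | 0.890 | 0.900 |
| 1 | Energ. | 1-svm | 0.843 | 0.856 | 0.768 | 0.811 |
| 1 | Energ. | 2-svm | 0.988 | 0.970 | 0.958 | 0.964 |
| 1 | LTP | 1-knn | 0.817 | 0.850 | 0.668 | 0.754 |
| 1 | LTP | 2-knn | 0.853 | 0.876 | 0.697 | 0.780 |
| 1 | LTP | 1-svm | 0.793 | 0.846 | 0.637 | 0.734 |
| 1 | LTP | 2-svm | 0.974 | 0.910 | 0.940 | 0.925 |
| 2 | RAW | 1-knn | 0.977 | 0.960 | 0.933 | 0.947 |
| 2 | RAW | 2-knn | 0.984 | 0.958 | 0.947 | 0.952 |
| 2 | RAW | 1-svm | 0.971 | 0.942 | 0.928 | 0.935 |
| 2 | RAW | 2-svm | 0.992 | 0.976 | 0.986 | 0.981 |
| 2 | Magn. | 1-knn | 0.885 | 0.910 | 0.762 | 0.833 |
| 2 | Magn. | 2-knn | 0.928 | 0.898 | 0.814 | 0.854 |
| 2 | Magn. | 1-svm | 0.804 | 0.908 | 0.661 | 0.775 |
| 2 | Magn. | 2-svm | 0.984 | 0.944 | 0.957 | 0.950 |
| 2 | Energ. | 1-knn | 0.891 | 0.870 | 0.803 | 0.836 |
| 2 | Energ. | 2-knn | 0.916 | 0.860 | 0.878 | 0.868 |
| 2 | Energ. | 1-svm | 0.918 | 0.910 | 0.851 | 0.880 |
| 2 | Energ. | 2-svm | 0.990 | 0.968 | 0.983 | 0.975 |
| 2 | LTP | 1-knn | 0.885 | 0.910 | 0.754 | 0.828 |
| 2 | LTP | 2-knn | 0.917 | 0.930 | 0.779 | 0.850 |
| 2 | LTP | 1-svm | 0.842 | 0.898 | 0.684 | 0.784 |
| 2 | LTP | 2-svm | 0.984 | 0.926 | 0.956 | 0.941 |
| 3 | RAW | 1-knn | 0.998 | 0.990 | 0.984 | 0.987 |
| 3 | RAW | 2-knn | 0.997 | 0.996 | 0.993 | 0.994 |
| 3 | RAW | 1-svm | 0.998 | 0.986 | 0.987 | 0.986 |
| 3 | RAW | 2-svm | 0.999 | 1.000 | 0.974 | 0.987 |
| 3 | Magn. | 1-knn | 0.991 | 0.990 | 0.972 | 0.981 |
| 3 | Magn. | 2-knn | 0.998 | 0.996 | 0.992 | 0.994 |
| 3 | Magn. | 1-svm | 0.637 | 0.996 | 0.600 | 0.773 |
| 3 | Magn. | 2-svm | 1.000 | 1.000 | 0.997 | 0.999 |
| 3 | Energ. | 1-knn | 0.898 | 0.890 | 0.767 | 0.826 |
| 3 | Energ. | 2-knn | 0.930 | 0.856 | 0.907 | 0.880 |
| 3 | Energ. | 1-svm | 0.996 | 0.998 | 0.974 | 0.986 |
| 3 | Energ. | 2-svm | 1.000 | 1.000 | 0.999 | 1.000 |
| 3 | LTP | 1-knn | 0.988 | 0.970 | 0.964 | 0.967 |
| 3 | LTP | 2-knn | 0.997 | 0.990 | 0.980 | 0.985 |
| 3 | LTP | 1-svm | 0.980 | 0.944 | 0.914 | 0.929 |
| 3 | LTP | 2-svm | 1.000 | 0.996 | 0.999 | 0.997 |

Table 2 Results obtained by both kNN and SVM schemas on the three collections. Here the accelerometer data contain 128 samples. Best result for each collection is reported in bold.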


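| C. | Feat. | Class. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|---|---|
| 1 | RAW | 1-knn | 0.980 | 0.980 | 0.940 | 0.960 |
| 1 | RAW | 2-knn | 0.983 | 0.962 | 0.952 | 0.957 |
| 1 | RAW | 1-svm | 0.954 | 0.946 | 0.903 | 0.924 |
| 1 | RAW | 2-svm | 0.986 | 0.954 | 0.969 | 0.961 |
| 1 | Magn. | 1-knn | 0.958 | 0.910 | 0.923 | 0.916 |
| 1 | Magn. | 2-knn | 0.961 | 0.910 | 0.929 | 0.919 |
| 1 | Magn. | 1-svm | 0.911 | 0.870 | 0.844 | 0.857 |
| 1 | Magn. | 2-svm | 0.967 | 0.904 | 0.948 | 0.926 |
| 1 | Energ. | 1-knn | 0.811 | 0.870 | 0.695 | 0.777 |
| 1 | Energ. | 2-knn | 0.925 | 0.910 | 0.890 | 0.900 |
| 1 | Energ. | 1-svm | 0.843 | 0.856 | 0.768 | 0.811 |
| 1 | Energ. | 2-svm | 0.988 | 0.970 | 0.958 | 0.964 |
| 1 | LTP | 1-knn | 0.936 | 0.860 | 0.889 | 0.875 |
| 1 | LTP | 2-knn | 0.942 | 0.882 | 0.891 | 0.885 |
| 1 | LTP | 1-svm | 0.890 | 0.826 | 0.820 | 0.823 |
| 1 | LTP | 2-svm | 0.959 | 0.890 | 0.923 | 0.906 |
| 2 | RAW | 1-knn | 0.988 | 0.960 | 0.978 | 0.969 |
| 2 | RAW | 2-knn | 0.990 | 0.964 | 0.977 | 0.970 |
| 2 | RAW | 1-svm | 0.979 | 0.948 | 0.954 | 0.951 |
| 2 | RAW | 2-svm | 0.990 | 0.960 | 0.981 | 0.970 |
| 2 | Magn. | 1-knn | 0.970 | 0.940 | 0.926 | 0.933 |
| 2 | Magn. | 2-knn | 0.976 | 0.942 | 0.938 | 0.940 |
| 2 | Magn. | 1-svm | 0.942 | 0.894 | 0.879 | 0.887 |
| 2 | Magn. | 2-svm | 0.984 | 0.952 | 0.955 | 0.953 |
| 2 | Energ. | 1-knn | 0.891 | 0.870 | 0.803 | 0.836 |
| 2 | Energ. | 2-knn | 0.916 | 0.860 | 0.878 | 0.868 |
| 2 | Energ. | 1-svm | 0.918 | 0.910 | 0.851 | 0.880 |
| 2 | Energ. | 2-svm | 0.990 | 0.972 | 0.988 | 0.980 |
| 2 | LTP | 1-knn | 0.958 | 0.900 | 0.902 | 0.901 |
| 2 | LTP | 2-knn | 0.964 | 0.938 | 0.894 | 0.915 |
| 2 | LTP | 1-svm | 0.930 | 0.900 | 0.832 | 0.865 |
| 2 | LTP | 2-svm | 0.980 | 0.934 | 0.933 | 0.934 |
| 3 | RAW | 1-knn | 0.998 | 1.000 | 0.974 | 0.987 |
| 3 | RAW | 2-knn | 0.997 | 1.000 | 0.995 | 0.998 |
| 3 | RAW | 1-svm | 0.999 | 0.986 | 0.997 | 0.992 |
| 3 | RAW | 2-svm | 1.000 | 0.996 | 0.985 | 0.991 |
| 3 | Magn. | 1-knn | 0.996 | 0.990 | 0.993 | 0.991 |
| 3 | Magn. | 2-knn | 0.998 | 0.998 | 0.991 | 0.994 |
| 3 | Magn. | 1-svm | 0.763 | 0.998 | 0.741 | 0.860 |
| 3 | Magn. | 2-svm | 0.999 | 0.998 | 0.991 | 0.995 |
| 3 | Energ. | 1-knn | 0.898 | 0.890 | 0.767 | 0.826 |
| 3 | Energ. | 2-knn | 0.930 | 0.856 | 0.907 | 0.880 |
| 3 | Energ. | 1-svm | 0.996 | 0.998 | 0.974 | 0.986 |
| 3 | Energ. | 2-svm | 1.000 | 1.000 | 0.999 | 1.000 |
| 3 | LTP | 1-knn | 0.996 | 0.960 | 0.991 | 0.975 |
| 3 | LTP | 2-knn | 0.997 | 0.986 | 0.987 | 0.987 |
| 3 | LTP | 1-svm | 0.997 | 0.986 | 0.978 | 0.982 |
| 3 | LTP | 2-svm | 1.000 | 0.998 | 0.989 | 0.994 |

Table 3 Results obtained by both kNN and SVM schemas on the three collections. Here the accelerometer data contain 51 samples. Best result for each collection is reported in bold.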

| Features | Class. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|---|
| RAW | 1-knn | 0.977 | 0.967 | 0.939 | 0.953 |
| RAW | 2-knn | 0.984 | 0.962 | 0.955 | 0.959 |
| RAW | 1-svm | 0.969 | 0.955 | 0.927 | 0.941 |
| RAW | 2-svm | 0.993 | 0.980 | 0.980 | 0.980 |
| Magn. | 1-knn | 0.900 | 0.933 | 0.797 | 0.861 |
| Magn. | 2-knn | 0.934 | 0.909 | 0.855 | 0.881 |
| Magn. | 1-svm | 0.736 | 0.928 | 0.625 | 0.761 |
| Magn. | 2-svm | 0.987 | 0.956 | 0.969 | 0.963 |
| Energ. | 1-knn | 0.867 | 0.877 | 0.755 | 0.813 |
| Energ. | 2-knn | 0.924 | 0.875 | 0.892 | 0.883 |
| Energ. | 1-svm | 0.919 | 0.921 | 0.864 | 0.892 |
| Energ. | 2-svm | 0.993 | 0.979 | 0.980 | 0.980 |
| LTP | 1-knn | 0.897 | 0.910 | 0.795 | 0.850 |
| LTP | 2-knn | 0.922 | 0.932 | 0.819 | 0.872 |
| LTP | 1-svm | 0.872 | 0.896 | 0.745 | 0.815 |
| LTP | 2-svm | 0.986 | 0.944 | 0.965 | 0.954 |

Table 4 Average results obtained by both kNN and SVM schemas with respect to the three collections. Here the accelerometer data contain 128 samples.


| Features | Class. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|---|
| RAW | 1-knn | 0.989 | 0.980 | 0.964 | 0.972 |
| RAW | 2-knn | 0.990 | 0.975 | 0.975 | 0.975 |
| RAW | 1-svm | 0.977 | 0.960 | 0.952 | 0.956 |
| RAW | 2-svm | 0.992 | 0.970 | 0.978 | 0.974 |
| Magn. | 1-knn | 0.975 | 0.947 | 0.947 | 0.947 |
| Magn. | 2-knn | 0.978 | 0.950 | 0.953 | 0.951 |
| Magn. | 1-svm | 0.872 | 0.921 | 0.822 | 0.868 |
| Magn. | 2-svm | 0.983 | 0.951 | 0.965 | 0.958 |
| Energ. | 1-knn | 0.867 | 0.877 | 0.755 | 0.813 |
| Energ. | 2-knn | 0.924 | 0.875 | 0.892 | 0.883 |
| Energ. | 1-svm | 0.919 | 0.921 | 0.864 | 0.892 |
| Energ. | 2-svm | 0.993 | 0.981 | 0.982 | 0.981 |
| LTP | 1-knn | 0.964 | 0.907 | 0.927 | 0.917 |
| LTP | 2-knn | 0.968 | 0.935 | 0.924 | 0.929 |
| LTP | 1-svm | 0.939 | 0.904 | 0.877 | 0.890 |
| LTP | 2-svm | 0.979 | 0.941 | 0.948 | 0.944 |

Table 5 Average results obtained by both kNN and SVM schemas with respect to the three collections. Here the accelerometer data contain 51 samples.

| Features | Collect. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|---|
| RAW | 1 | 0.963 | 0.946 | 0.918 | 0.932 |
| RAW | 2 | 0.981 | 0.959 | 0.949 | 0.954 |
| RAW | 3 | 0.998 | 0.993 | 0.984 | 0.989 |
| Magn. | 1 | 0.860 | 0.884 | 0.746 | 0.809 |
| Magn. | 2 | 0.900 | 0.915 | 0.799 | 0.853 |
| Magn. | 3 | 0.906 | 0.996 | 0.890 | 0.937 |
| Energ. | 1 | 0.892 | 0.901 | 0.828 | 0.863 |
| Energ. | 2 | 0.929 | 0.902 | 0.879 | 0.890 |
| Energ. | 3 | 0.956 | 0.936 | 0.912 | 0.923 |
| LTP | 1 | 0.859 | 0.871 | 0.736 | 0.798 |
| LTP | 2 | 0.907 | 0.916 | 0.793 | 0.851 |
| LTP | 3 | 0.991 | 0.975 | 0.964 | 0.969 |

Table 6 Average results obtained by both kNN and SVM schemas with respect to the four data representations. Here the accelerometer data contain 128 samples.


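| Features | Collect. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|---|
| RAW | 1 | 0.976 | 0.960 | 0.941 | 0.951 |
| RAW | 2 | 0.987 | 0.958 | 0.973 | 0.965 |
| RAW | 3 | 0.998 | 0.995 | 0.988 | 0.992 |
| Magn. | 1 | 0.949 | 0.898 | 0.911 | 0.905 |
| Magn. | 2 | 0.968 | 0.932 | 0.924 | 0.928 |
| Magn. | 3 | 0.939 | 0.996 | 0.929 | 0.960 |
| Energ. | 1 | 0.892 | 0.901 | 0.828 | 0.863 |
| Energ. | 2 | 0.929 | 0.903 | 0.880 | 0.891 |
| Energ. | 3 | 0.956 | 0.936 | 0.912 | 0.923 |
| LTP | 1 | 0.932 | 0.865 | 0.881 | 0.872 |
| LTP | 2 | 0.958 | 0.918 | 0.890 | 0.904 |
| LTP | 3 | 0.998 | 0.982 | 0.986 | 0.984 |

Table 7 Average results obtained by both kNN and SVM schemas with respect to the four data representations. Here the accelerometer data contain 51 samples.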


| Class. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|
| 1-knn | 0.910 | 0.922 | 0.822 | 0.869 |
| 2-knn | 0.941 | 0.920 | 0.880 | 0.898 |
| 1-svm | 0.874 | 0.925 | 0.790 | 0.852 |
| 2-svm | 0.990 | 0.965 | 0.974 | 0.969 |

Table 8 Average results obtained by all classifiers with respect to the four data representations and the three collections. Here the accelerometer data contain 128 samples.


| Class. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|
| 1-knn | 0.948 | 0.927 | 0.898 | 0.912 |
| 2-knn | 0.965 | 0.934 | 0.936 | 0.934 |
| 1-svm | 0.927 | 0.926 | 0.879 | 0.901 |
| 2-svm | 0.987 | 0.961 | 0.968 | 0.964 |

Table 9 Average results obtained by all classifiers with respect to the four data representations and the three collections. Here the accelerometer data contain 51 samples.

| C. | Feat. | Class. | AUC | SE | SP | $\sqrt{SE \cdot SP}$ |
|---|---|---|---|---|---|---|
| 2 | RAW(128) | 2-svm | 0.992 | 0.976 | 0.986 | 0.981 |
| 2 | RAW(51) | 1-knn | 0.988 | 0.960 | 0.978 | 0.969 |
| 3 | Energ.(128) | 2-svm | 1.000 | 1.000 | 0.999 | 1.000 |
| 3 | RAW(51) | 1-svm | 0.999 | 0.986 | 0.997 | 0.992 |

Table 10 Results obtained by the best feature vectors in the case of both the two-classes and one-class classifiers.